PyDigger - unearthing stuff about Python

Found 10 out of 324,963. Showing 10 on page 1. Total pages: 1.

Name	Version	Summary	Date
docstrange	1.1.6	Extract and Convert PDF, Word, PowerPoint, Excel, images, URLs into multiple formats (Markdown, JSON, CSV, HTML) with intelligent content extraction and advanced OCR.	2025-09-10 09:27:30
html2text-rs	0.2.5	Convert HTML to markdown or plain text	2025-08-30 16:48:13
wizardhtml	1.0.1	WHATWG-compliant HTML5 toolkit: DFA tokenizer, spec-guided tree builder, DOM, configurable serializer, high-level cleaner, pretty-printer, and HTML to Markdown.	2025-08-29 12:37:55
aspose-html-net	25.8.0	Aspose.HTML for Python via .NET is a powerful API for Python that provides a headless browser functionality, allowing you to work with HTML documents in a variety of ways. With this API, you can easily create new HTML documents or open existing ones from different sources. Once you have the document, you can perform various manipulation operations, such as removing and replacing HTML nodes.	2025-08-27 12:36:52
document-data-extractor	1.0.4	Best open-source document to markdown extractor for LLM training data. Convert PDF, Word, PowerPoint, Excel, images, URLs to clean markdown, JSON, HTML locally. Alternative to Unstructured, Docling, Marker, MarkItDown, MinerU, PaddleOCR, Tesseract	2025-07-29 08:25:56
llm-data-converter	2.2.0	Best open-source document to markdown converter for LLM training data. Convert PDF, Word, PowerPoint, Excel, images, URLs to clean markdown, JSON, HTML locally. Alternative to Unstructured, Docling, Marker, MarkItDown, MinerU, PaddleOCR, Tesseract	2025-07-25 13:32:07
rapid-crawl	0.1.0	A powerful Python SDK for web scraping, crawling, and data extraction - inspired by Firecrawl	2025-07-11 12:32:22
webpage2md	1.0.0	Convert HTML files and web pages to Markdown format	2025-02-19 14:01:34
spiderforce4ai	2.6.7	Python wrapper for SpiderForce4AI HTML-to-Markdown conversion service with LLM post-processing	2025-02-16 14:44:55
pyhtml2md	1.6.0	Transform your HTML into clean, easy-to-read markdown with pyhtml2md.	2024-06-01 09:48:25
